Speech prosody enhances the neural processing of syntax

Human language relies on the correct processing of syntactic information, as it is essential for successful communication between speakers. As an abstract level of language, syntax has often been studied separately from the physical form of the speech signal, thus often masking the interactions that can promote better syntactic processing in the human brain. However, behavioral and neural evidence from adults suggests that prosody and syntax interact, and studies in infants support the notion that prosody assists language learning. Here we analyze an MEG dataset to investigate how acoustic cues, specifically prosody, interact with syntactic representations in the brains of native English speakers. More specifically, to examine whether prosody enhances the cortical encoding of syntactic representations, we decode syntactic phrase boundaries directly from brain activity and evaluate possible modulations of this decoding by prosodic boundaries. Our findings demonstrate that the presence of prosodic boundaries improves the neural representation of phrase boundaries, indicating the facilitative role of prosodic cues in the processing of abstract linguistic features. This work has implications for interactive models of how the brain processes different linguistic features. Future research is needed to establish the neural underpinnings of prosody-syntax interactions in languages with different typological characteristics.


Introduction
Human language is based on hierarchically structured syntax, allowing for robust and efficient communication. This structure plays an important role in speech comprehension, as it lays down a set of rules, our syntax, over which an infinite number of utterances can be created. Thus, the correct processing of syntactic information becomes fundamental for successful communication across speakers. Aside from syntactic rules, different abstract sources of information can change the way these utterances are processed (e.g., semantic content, context, predictions based on statistics), but research in the last twenty years has shown that syntax can also be tightly linked to a more physical aspect of speech: prosodic information 1,2 .
The suprasegmental information carried by prosody has multiple acoustic markers that can be used to orient the interpretation of a sentence. These acoustic markers can include variation in pitch or in the energy of the speech signal, as well as lengthening of syllables or the insertion of long pauses. For example, the sentence "Johnny the little boy is playing" can be parsed into two different syntactic structures based on the temporal location of the acoustic cues. The lengthening of the word boy and the placement of an additional pause after it can differentiate between two syntactic structures: one where the sentence is directed at someone called 'Johnny' (i.e., "Johnny, the little boy is playing"), and one where 'Johnny' is the subject of the verb phrase (i.e., "Johnny, the little boy, is playing") and the noun phrase 'the little boy' is an apposition. This example shows how prosodic boundaries can dramatically change how words need to be grouped together in order to successfully communicate concepts. This particular feature of prosodic boundaries can be seen as one of the most evident connections between prosody and syntax, as it highlights a close correspondence with grammatical phrase boundaries 1,2 . Indeed, the redundancy between these two linguistic features might be integral to facilitating the mapping of the incoming speech signal onto a syntactic structure and disambiguating between alternative parses.
The significant role of prosody in ontogeny is consistent with its posited role as a precursor of language 10 , as shown by work demonstrating that newborns can already distinguish languages from the same rhythmic class as their environmental language from languages of different rhythmic classes 10,11 . In turn, such a developmentally early sensitivity to prosody likely arises from prenatal processing and learning 12,13 , thanks to an auditory system that is established and functional in the last 4-6 weeks of gestation [14][15][16] . Indeed, prenatal exposure to speech and other acoustic signals is more faithful to slower-changing acoustic information, such as that contained in prosody, due to the low-pass filtering of sound by the amniotic fluid [17][18][19][20] .
Although results in children show that syntactic learning becomes less reliant on prosody later in life 21,22 , there is evidence of a continuous interaction between these two features even in adults. This is likely because prosodic signals can be a reliable acoustic and perceptual scaffold for the online processing of syntactic features 23 . Indeed, it has been shown that prosodic cues such as pitch variation and syllable duration can assist syntax learning more than transitional probabilities 24 . Prosody also helps the initial parsing of sentences 25,26 , morphosyntactic dependency learning 7 , and speech segmentation 7 in adults.
Biological evidence for a prosody-syntax interaction further supports the importance of prosody for more reliable syntactic parsing. Brain imaging research using electroencephalography (EEG) has shown that changes in pause, pitch, loudness, or duration that mark boundaries between phrases, known as prosodic boundaries 7 , affect neural markers of syntactic disambiguation. For example, it has been found that manipulating the placement of a prosodic boundary modulates the evoked responses to the stimuli so as to assist syntactic disambiguation 27,28 and the processing of garden-path sentences [29][30][31] . Conversely, research has also shown the influence of syntactic predictions on prosodic processing. For instance, it has been shown that a neural marker of prosodic processing, the closure positive shift 32 , can be elicited by biases generated from initial syntactic parsing preferences, even in the absence of explicit acoustic cues 33 . It is thus possible to hypothesize that the regular interaction between, and co-occurrence of, these two linguistic features has driven the brain to expect syntax and prosody to be coherent even when one of the two is not explicitly present, or when they do not co-occur. In line with this, a recent study has demonstrated that the cortical tracking of naturalistic speech is enhanced when overt and covert prosodic boundaries are aligned with the syntactic structure compared to when they are misaligned 34 .
Very few fMRI studies have examined the interaction between prosody and other, more abstract levels of linguistic information. Seminal studies on the processing of linguistic prosody per se involved comparing different types of filtered speech. These studies showed mixed results in terms of hemispheric lateralization, with some identifying right 35,36 but also some left 37 hemispheric involvement of temporal and frontal-opercular regions in relation to the modulation of pitch (i.e., intonation) during speech processing. The lateralization likely depends on which information is being extracted from the signal or on the control condition [38][39][40][41] . The processing of higher-level linguistic features such as syntax is known to be more left-lateralized, involving regions such as the inferior frontal gyrus (IFG) and middle temporal gyrus [42][43][44] . The few fMRI studies that have focused on the question of prosody-syntax interactions showed, for example, that in tonal languages (here: Mandarin), there are shared activations in the left but not right IFG between intonational (question vs. answer) and phonologically relevant tonal (word category) processing 45 . Another investigation similarly reported a role of the left IFG in processing linguistically relevant intonational cues; this region's activation was modulated when those cues played a crucial role in sentence comprehension 41 .
Although research has shown neural evidence for prosody-syntax interactions, more work is needed to fill the knowledge gap regarding how the brain processes these two linguistic features jointly in more naturalistic contexts. Most studies to date, especially those using EEG, have exploited neural signals emerging from the violation or disambiguation of prosodic or syntactic cues. In the context of EEG studies, this has allowed researchers to assess neural markers through time-locked analyses 46 , but has limited implications for the neural processing of syntactic and prosodic information during naturalistic speech perception. Moreover, controlled studies that make use of expectations or violations can be highly affected by error detection or other processes that are not central to syntactic judgments, again making it difficult to draw conclusions regarding the processing of prosody-syntax interactions per se 30 .
In contrast to what can be assessed with well-controlled experimental paradigms, human communication is based on a very complex, multidimensional signal that spans different sensory modalities. Hence, the use of tightly controlled experiments and reductionist, task-based paradigms falls short of an ecological characterization of the neural processes underlying naturalistic language processing 47 . The generalizability of neuroscience research has gained attention in recent years, and the last decade has seen increased use of narratives in experimental designs to simulate natural speech perception. For this, brain activity is measured during passive listening, to closely simulate the way speech is processed in day-to-day contexts. In such studies, control over the effect(s) of interest can be applied after data collection, at the analysis stage, thus increasing the flexibility and ecological validity of the experiment 48 . A driving force underlying this paradigm change is the increasingly flexible and powerful computational tools that are becoming available to model the multidimensional, hierarchical, and interactive nature of human speech. Indeed, the use of machine learning techniques to statistically assess (and control for) different linguistic features has led to a new line of research that takes advantage of state-of-the-art (linear or nonlinear) computational models of cognitive or neural function 49 . Importantly, such model-based encoding and decoding approaches allow results to be interpreted in light of proposed underlying cognitive and/or neural mechanisms 50 . Recent studies have already shown promising results in using linearized modeling approaches for pinpointing the neural signatures underlying the processing of semantics 51-53 , syntax [54][55][56] , phonetics 57,58 , and acoustics [59][60][61] . These computational models of the different levels of the linguistic hierarchy have, however, often been investigated in isolation, and very few studies have looked at their possible interactions 52 .
In the present study, we made use of naturalistic stimuli together with model-based decoding 49 to fill a crucial knowledge gap in how the cortical representation of syntax can be modulated by prosody in order to facilitate speech processing and overcome structural ambiguity. We used state-of-the-art machine learning techniques to compute the syntactic dependency structure serving to express functional relationships between words, and the prosodic boundaries of a speech corpus (TED talks).
While formally equally expressive 62 , dependency grammars differ from phrase structure grammars in that they do not posit unobserved constituents and instead create hierarchical syntactic representations based entirely on dependencies between words. In this respect, dependencies are a more parsimonious representation, as they introduce only a minimal set of assumptions. Moreover, previous work has shown that after controlling for structural distance (which does not distinguish phrasal from dependency information), phrase structural predictors do not significantly contribute to an explanation of activity in the temporal lobe 63 . In fact, while dependency structures have been less explored than phrase structure grammars in neuroimaging, recent research has revealed the cognitive significance of the operations underpinning the functional relationships among the words of a sentence, particularly left-side dependencies 64 . In this latter study, neural activations underlying two syntactic features that operationalize phrase and dependency structures were compared: closed phrase structures and left dependencies. Importantly, left dependencies not only activated core areas of the language network, such as the left anterior temporal lobe; left inferior frontal activation was even found to be more specific to dependency than to phrase structure. These results corroborate the existence of a cortical network that supports the representation of the incremental build-up of dependency relations. Finally, the relevance of left dependencies for more complex syntactic operations is confirmed by a substantial body of psycholinguistic work based on dependency relations 65 and by work showing the interchangeability of dependency and phrase structure grammars [66][67][68] . As an illustration, the sketch below shows how such left-dependency flags can be derived from an automatic parse.
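For illustration, left-dependency flags can be derived from any automatic dependency parse. The sketch below uses spaCy as a stand-in parser (an assumption on our part, not the pipeline used in the present study) and flags a word when at least one of its syntactic dependents precedes it, which is one possible operationalization of a left-side dependency:

```python
# Illustrative sketch (not the authors' pipeline): tagging each word with a
# binary "left dependency" feature using spaCy's dependency parser.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

def left_dependency_flags(sentence: str) -> list[tuple[str, bool]]:
    """Return (word, has_left_dependency) pairs for one sentence.

    A word is flagged when it closes a dependency with a preceding word,
    i.e. when at least one of its syntactic children sits to its left.
    """
    doc = nlp(sentence)
    return [(tok.text, any(child.i < tok.i for child in tok.children))
            for tok in doc if not tok.is_punct]

print(left_dependency_flags("Johnny, the little boy, is playing."))
```

In the apposition reading of the example sentence, for instance, the verb 'playing' closes dependencies with its preceding subject and auxiliary and would thus be flagged.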
Here, we first tested whether the strength of the prosodic boundaries present in the speech data was modulated as a function of left-side dependency when producing natural speech (TED talks). Then, in an existing magnetoencephalography (MEG) dataset of participants listening to the same TED talks, we investigated whether the neural encoding mirrors the prosodic-syntactic interaction we identified in the speech signal. Such a finding would provide biological evidence for 'prosodic boosting' of syntactic encoding in the adult brain, reflecting a neural mechanism that allows prosodic cues to increase the robustness of syntactic representations, and thus of successful speech comprehension.

Results

Stimulus spectrogram analysis
We first analyzed the spectrograms of the TED talks to understand whether the speakers used different prosodic cues in correspondence with the syntactic dependency structure of their utterances. These results allowed the formation of three different combinations of words that could then be evaluated in the brain decoding. The first comprised words with a weak prosodic boundary, called here 'neutral', with either the presence or absence of a left dependency. A second group included words without a left dependency and with a weak prosodic boundary, as well as words with a left dependency and a strong prosodic boundary; it is called here 'coherent', as it included the most statistically probable combinations based on the stimulus analysis. Vice versa, the third and final group was composed of words with a left dependency and a weak prosodic boundary, and words without a left dependency and a strong prosodic boundary. This latter combination was called 'incoherent', since it had the least probable associations of prosody and syntax given our analysis of the speech data. Crucially, the choice of the three sets allowed us to assess (1) the contribution of prosodic boundary strength to the neural encoding of syntactic operations, and (2) whether the interaction between syntax and prosody seen in the stimuli was also mirrored in the cortical representations of the participants (see Neural analysis). The set definitions are illustrated in the sketch below.
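As a minimal illustration of these set definitions, the sketch below labels a hypothetical word table (column names and values are our own illustrative stand-ins, not the actual stimulus annotations):

```python
import pandas as pd

# Hypothetical word-level table; the column names are illustrative stand-ins
# for the binary syntactic (left dependency) and prosodic (boundary strength)
# features described in the text.
words = pd.DataFrame({
    "word":      ["boy", "playing", "little", "is"],
    "left_dep":  [True,  True,      False,    False],
    "strong_pb": [True,  False,     False,    True],
})

# Neutral: weak prosodic boundary, with or without a left dependency.
neutral = words[~words.strong_pb]

# Coherent: the statistically dominant pairings found in the stimuli, i.e.
# the two binary features agree (no left dependency + weak boundary, or
# left dependency + strong boundary).
coherent = words[words.left_dep == words.strong_pb]

# Incoherent: the mismatched pairings (the two features disagree).
incoherent = words[words.left_dep != words.strong_pb]
```

Note that under these definitions, the coherent set reduces to the words whose two binary features agree and the incoherent set to those where they disagree.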

Neural analysis
Our analysis of brain activity involved the multivariate decoding of syntactic information from brain activity using a combination of the mne-Python software and custom code.
The MEG data was divided into smaller trials based on word offsets, and a logistic classifier was trained to map brain activity to the presence or absence of left dependencies. Two types of classifications were performed: one using a classical multivariate pattern analysis (MVPA) thus concatenating MEG data across time and channels (Fig. 3), and another performing a temporal classification over the epoched data (Fig. 4). Importantly, the decoder was trained only on neutral combinations to evaluate the neural processing of syntactic information within a weak prosodic context, thus avoiding any inherent bias that was present in the stimuli. By generalizing the performance of the model across the neutral, coherent, and incoherent conditions, it was then possible to quantify and assess differences in the strength of syntactic representations in the brain under various prosodic conditions.
MVPA decoding

First, we tested whether the neural encoding of left dependencies was modulated by prosodic strength, in line with our stimulus analysis. Specifically, we tested the hypothesis that syntactic dependency classification performance is highest in the 'coherent' condition, where the presence of a left dependency is coupled with strong prosodic cues.
In the MVPA analysis of the pre- and post-word-offset time windows, decoding performance across the neutral, coherent, and incoherent test sets is reported in Fig. 3.

Temporal decoding
The results of the temporal decoding analysis likewise showed significant encoding of syntactic dependencies in all three test sets (Fig. 4, upper row). The permutation test in the coherent condition revealed a cluster that spanned both the pre- and post-word-offset windows.

Fig. 4. Temporal decoding of left dependencies for the coherent (A), incoherent (blue, B), and neutral (yellow, C) sets. All three test sets performed above chance, but both decoding accuracy and the significant time windows differed across these sets. The performance of the test sets was then directly compared (D and E) to identify the time windows during which the coherent set performed better than the incoherent and neutral ones. In the post-stimulus time window of 100-400 ms, the decoding of syntactic dependencies in the coherent condition (i.e., trials with a combination of prosodic and syntactic features consistent with the pattern most frequently detected in the stimuli) was significantly stronger than for left dependencies with weak or mismatched prosodic boundary strength. Horizontal red lines represent significant results under cluster-based permutation tests.
Discussion

The purpose of this study was to shed light on the interaction between the processing of prosody and syntactic dependencies in the adult brain. We aimed to uncover their relation both within utterances produced by different TED talk speakers and within the neural representation of syntactic dependencies. We tested the hypothesis that the brain processes syntactic dependencies differently depending on the prosodic structure of the sentences; this would lend support to the idea that prosodic information is used by the brain to facilitate, and disambiguate, the representation of syntactic relationships during speech processing.
In our stimulus analysis, we showed that speakers modulate their speech output depending on the intended syntactic structure. More specifically, we showed that stronger prosodic boundaries were more likely to be placed on words that closed a syntactic left dependency relation. Conversely, the speech utterances had a relatively weak, if not absent, prosodic boundary for words that did not have a preceding functional relationship.
These results are in line with known linguistic relationships between prosody and syntax in speech, based on evidence that the placement of stress or boundaries in intonational phrases can reflect the syntactic structure of the sentence 1,2 . Our findings support the presence of an interaction between the closure of functional relationships between words and prosodic cues, likely reflecting a mechanism whereby speakers, whether consciously or not, modulate their speech output to help the listener focus on the more salient syntactic relationships between words in order to facilitate comprehension.
Such evidence for prosody-syntax relationships has not, however, previously been investigated in speech corpora using fully automated, large-scale language models for the extraction of prosodic and syntactic features like the ones adopted here 69,70 . Instead, most of the research exploiting these powerful language parsers has used them to extract prosodic and syntactic regularities from large amounts of data to produce better text-to-speech generators 71,72 or to improve the quality of syntactic parsing 73 . Deep learning approaches can indeed be useful, as they indirectly demonstrate a statistical relationship between prosody and syntax, but only a few of these studies have shown links that are linguistically interpretable 74 . Here, we used data-driven language models to allow the automated extraction of well-characterized linguistic features from a continuous, naturalistic speech signal, and as such our work contributes to a more focused and quantitative analysis of the interface between prosody and syntax than has been demonstrated to date.
In the MEG data analysis, we asked whether the neural processing of syntactic information is modulated by prosody when listening to the same TED talks. We first trained a decoding model that was free of prosodic biases to learn the mapping of syntactic dependency information in the neural data. Then, during generalization testing, we evaluated this mapping separately in the neutral, coherent, and incoherent conditions. Our temporal decoding analysis revealed interesting qualitative patterns before and after word offsets. In all conditions, decoding performance was above chance even before word offsets. This suggests the presence of predictive mechanisms, whereby the brain anticipates the closure of functional relationships between words before the words are completely heard. In the coherent condition, significant decoding lasted up to 400 ms after word offsets, suggesting an additive contribution of acoustic cues (i.e., the presence of prosodic boundary strength) to the higher-order processing underlying speech comprehension following syntactic closures 75 . In other words, the presence of prosodic information appears to promote longer-lasting neural processing of syntax, possibly arising from reverberating top-down and bottom-up processing during comprehension. Interestingly, the decoding of words that contained mismatching syntax-prosody combinations (i.e., a left dependency with weak prosodic boundary strength, or no left dependency with strong prosodic boundary strength; the incoherent condition) showed above-chance performance until word offset, but not afterward. We speculate that this shorter-lasting neural signature of syntactic processing may be due to interference arising from syntax-prosody combinations that diverge from the statistical regularities to which our brains are normally exposed.
Similar results have also been found in ERP studies, where it has been shown that superfluous prosodic breaks (i.e., ones that do not coincide with syntactic breaks) disrupt sentence processing (i.e., comprehension) more than missing prosodic breaks (i.e., the absence of a prosodic break at the position of a syntactic break) 29,30 .
Our decoding analysis of MEG data limits our ability to localize the brain regions involved in the prosody-syntax interface. Previous studies of the brain networks modulated by linguistic prosody per se have shown the involvement of regions within the ventral and dorsal pathways, spanning the superior temporal lobe and sulcus, premotor cortex, posterior temporal lobe, and IFG, often biased towards the right hemisphere 38,79 but also found in the left, or bilaterally [80][81][82] . Interestingly, two of these areas, the posterior temporal lobe and the IFG, have historically been considered to also play a crucial role as hubs for syntactic processing [83][84][85] . It can thus be hypothesized that the posterior temporal lobe and the IFG may be possible loci for the interface between prosody and syntax.
Indeed, recent findings seem to indicate the involvement of the left IFG in cases where intonation is used to establish sentence structure 41,86 . Further investigation on the role of linguistic prosody, especially in interaction with syntax, remains to be conducted. This work could for example employ functional connectivity analyses, to shed light on how the respective networks interact to support synergistic interaction between different levels of (para-)linguistic processing. This research ought ideally to also integrate more complex models that can disentangle the amount of information conveyed by prosody and its potential to benefit the structuring of human speech. For example, extending our computational approach to data with higher spatial resolution, such as fMRI, could uncover the neural architecture that exploits prosodic information to enhance the cortical representation of syntactic structures.

Methods

MEG data acquisition and preprocessing
We made use of an existing MEG dataset obtained from 11 participants (20-34 years; 5 female) in previously published work 57 . The MEG signals were recorded using a 275-channel VSM/CTF system, with a sampling rate of 2400 Hz, and low-pass filtered at 660 Hz.
Independent component analysis was used to remove artifacts due to eye movements and heartbeats. The MEG signal from each channel was subsequently visually inspected to remove channels containing excessive broadband power. For the coregistration between the anatomical data and the MEG channels, the head shape of each participant was digitized using a Polhemus Isotrak system (Polhemus Inc., USA) and aligned with T1-weighted structural MRI data (1.5 T, 240 x 240 mm field of view, 1 mm isotropic, sagittal orientation).
Forward modeling was computed on the MEG data using the overlapping-sphere model 87 . A noise-normalized minimum-norm operator was used to estimate the inverse solution 88 . Further, singular value decomposition was performed on the resolution matrices defining the source space. For each session and subject, this operation produced a set of M singular vectors reflecting an orthogonal space onto which the coregistered data were projected. By taking the 99% cut-off of the singular value spectrum, we defined a low-dimensional space Xc over which the decoding analysis was computed (see the sketch below). For further details regarding the data preparation, see 57 .
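As an illustration of this reduction step, the following numpy sketch (a reconstruction under our own assumptions; in particular, the 99% criterion is applied here to the cumulative singular value sum, which is one plausible reading of the cut-off) projects the data onto the retained singular vectors:

```python
import numpy as np

def project_to_low_dim(resolution_matrix, data, energy=0.99):
    """Project MEG data onto the leading singular vectors of the resolution
    matrix. Sketch only: the cut-off is taken on the cumulative singular
    value sum, one plausible reading of the 99% criterion in the text.
    """
    U, s, _ = np.linalg.svd(resolution_matrix, full_matrices=False)
    # number of components needed to reach the requested spectrum fraction
    n_keep = int(np.searchsorted(np.cumsum(s) / s.sum(), energy)) + 1
    return U[:, :n_keep].T @ data  # the low-dimensional representation Xc
```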

Speech stimuli
The speech stimuli were force-aligned to the TED talk text, extracted from the online transcripts, using the Montreal Forced Aligner (MFA) 90 . The automatic alignment was implemented with the LibriSpeech corpora 91 , available as an MFA model. The output of the alignment was examined and corrected manually using the Praat software 92 . Further stimulus adjustments were made to optimize the subsequent MEG analysis (see Decoding section). First, out-of-dictionary tokens were labeled as <unk> and removed from the set. Second, to avoid positional confounds due to end-of-sentence effects, the last word of each sentence was also removed.
The complete preprocessing of the TED talks produced a stimulus set consisting of 1723 unique words that were subsequently used in the stimulus and MEG data analyses. In the second step, we computed a spectro-temporal analysis of the audio files to extract word duration, energy, and f0 information. The time courses of these three features were then combined and analyzed via wavelet transform. The spectrogram obtained after this last step was used to evaluate the maps of minimal power around word offsets to assess the prosodic boundary strength associated with each word. This unsupervised modeling of prosodic information was computed using the Wavelet Prosody Toolkit (https://github.com/asuni/wavelet_prosody_toolkit), which has been shown to perform as well as gold-standard approaches for the annotation of prosodic boundaries 69 . Words that were automatically tagged with a prosodic boundary strength of 0 were discarded from further analysis, as they did not contain meaningful spectral content. The distribution of prosodic boundary strengths was subsequently modeled using a two-component Gamma mixture model to cluster the complete stimulus set into two latent distributions, as sketched below. This step allowed us to classify the prosodic content of each word into the binary categories of weak and strong prosodic boundary strength.
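The fitting routine is not specified here; the following is a minimal EM-style sketch with moment-matching M-steps (function name, initialization, and convergence handling are our own assumptions) for clustering boundary strengths into two latent Gamma components:

```python
import numpy as np
from scipy.stats import gamma

def fit_two_gamma_mixture(x, n_iter=100):
    """EM-style fit of a two-component Gamma mixture (sketch only).

    M-steps use weighted moment matching rather than maximum likelihood;
    the published analysis may have used a different routine.
    """
    x = np.asarray(x, dtype=float)
    resp = (x > np.median(x)).astype(float)   # init: split at the median
    w = np.array([1 - resp.mean(), resp.mean()])
    params = []
    for _ in range(n_iter):
        params = []
        for r in (1 - resp, resp):
            m = np.average(x, weights=r)                  # weighted mean
            v = np.average((x - m) ** 2, weights=r)       # weighted variance
            params.append((m * m / v, v / m))             # (shape, scale)
        like = np.stack([w[k] * gamma.pdf(x, a, scale=s)
                         for k, (a, s) in enumerate(params)])
        resp = like[1] / like.sum(axis=0)                 # P(strong | x)
        w = np.array([1 - resp.mean(), resp.mean()])
    return params, resp > 0.5                             # True -> 'strong'
```

Calling `params, strong = fit_two_gamma_mixture(strengths)` would then yield a boolean strong-boundary label per word.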
The final feature matrix Y describing our stimuli was thus comprised of a syntactic feature characterizing the words as having or not having a left dependency, and of a prosodic feature characterizing the words as having a weak or strong prosodic boundary strength ( Fig. 1).
Assessing the prosody-syntax interaction in the stimuli

To assess the interaction between prosodic and syntactic dependency information in the stimuli, we first separated all the words present in the dataset based on their weak or strong prosodic boundary strength. We then compared these two groups in terms of the percentage of words that did or did not have a left-side dependency relation (see the sketch below). This allowed us to examine whether or not prosodic boundary strength was modulated in relation to syntactic dependencies during the production of the selected TED talks.
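In code, this comparison amounts to a cross-tabulation of the two binary features. The sketch below uses an illustrative word table; the chi-square test in the final line is our own addition, as the text does not name a specific statistical test:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical word table (same illustrative columns as the earlier sketch).
words = pd.DataFrame({
    "left_dep":  [True, True, False, False, True, False],
    "strong_pb": [True, False, False, True, True, False],
})

# Percentage of words with / without a left dependency in each prosodic group.
table = pd.crosstab(words.strong_pb, words.left_dep)
print(table.div(table.sum(axis=1), axis=0) * 100)

# One illustrative way to quantify the association (our assumption, not
# necessarily the authors' analysis choice).
chi2, p, dof, _ = chi2_contingency(table)
```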

Model-based decoding of syntax in the MEG data
The multivariate decoding of syntactic information from brain activity was implemented using a combination of the mne-Python software (version 1.0, https://github.com/mne-tools/mne-python) and custom code. The low-dimensional MEG data Xc were separated into smaller trials by extracting the brain activity before and after each word offset, as sketched below. Trials with an inter-trial offset interval of less than 100 ms were discarded.
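A minimal sketch of this epoching step, assuming the low-dimensional data Xc is an array of shape (components, samples) and that word-offset times are available in seconds (both assumptions about the data layout are ours):

```python
import numpy as np

def epoch_around_offsets(Xc, sfreq, offsets_s, tmin=-0.4, tmax=0.8):
    """Cut (n_trials, n_components, n_times) epochs around word offsets.

    Xc: (n_components, n_samples) low-dimensional MEG data.
    offsets_s: word-offset times in seconds. Offsets closer than 100 ms
    to the preceding one are discarded, as in the analysis above.
    """
    offsets = np.asarray(offsets_s)
    keep = np.r_[True, np.diff(offsets) >= 0.1]   # 100 ms spacing rule
    lo, hi = int(tmin * sfreq), int(tmax * sfreq)
    trials = [Xc[:, int(t * sfreq) + lo: int(t * sfreq) + hi]
              for t in offsets[keep]
              if int(t * sfreq) + lo >= 0 and int(t * sfreq) + hi <= Xc.shape[1]]
    return np.stack(trials)
```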
Following that, a logistic classifier was trained to learn a linear mapping between brain activity and the presence or absence of left dependencies. Two different types of classification were computed. First, MEG data were concatenated across time and channels in the pre- and post-word-offset time windows to assess the general prosodic boosting, with the prediction that syntactic decoding performance would be modulated by prosodic boundary strength in a way that is coherent with the pattern obtained in the stimuli themselves. Second, a refined decoding analysis of each time point between 400 ms pre-offset and 800 ms post-offset was performed, to assess finer temporal modulation patterns of the cortical processing of left dependencies.

Training and test data sets
The data to be used for training versus testing were selected based on the data-driven modeling computed on the word stimuli. Following the analysis of the stimuli (see 'Stimulus spectrogram analysis' in the Results), the training set was created by selecting only words having a weak prosodic boundary strength, including an equal number of words with and without left dependencies (see the sketch below). This allowed us to obtain a classifier that was not influenced by the different levels of prosodic content and that was syntactically balanced. The test sets were then defined according to the prosody-syntax combinations identified in the stimulus analysis. Moreover, this procedure allowed us to quantify differences in the strength of syntactic encoding across the conditions, given that they were balanced such that each included the same number of stimuli with versus without left dependencies.
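A sketch of this training-set construction, using illustrative variable names and synthetic feature values:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical word table with the two binary features (synthetic values).
words = pd.DataFrame({
    "left_dep":  rng.integers(0, 2, 200).astype(bool),
    "strong_pb": rng.integers(0, 2, 200).astype(bool),
})

# Train only on weak-boundary ('neutral') words, subsampled so that words
# with and without a left dependency are equally represented.
neutral_idx = np.flatnonzero(~words.strong_pb.values)
pos = neutral_idx[words.left_dep.values[neutral_idx]]
neg = neutral_idx[~words.left_dep.values[neutral_idx]]
n = min(len(pos), len(neg))
train_idx = np.concatenate([rng.choice(pos, n, replace=False),
                            rng.choice(neg, n, replace=False)])
y_train = words.left_dep.values[train_idx]
```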

MVPA of pre- and post-word-offset time windows
We used a logistic classifier (regularization parameter C = 1) to predict the presence or absence of left dependencies. This linear mapping was tested by pooling all brain activity (timepoints x channels) in two separate time windows: a 400 ms window before word offset and a 400 ms window after word offset. We hypothesized that the MVPA approach would also show that the encoding of left dependencies is modulated by the presence of prosodic information, i.e., that we would find evidence for prosodic boosting of syntactic processing. A sketch of this analysis is given below.
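The sketch below illustrates the window-pooled MVPA on synthetic data (array shapes, sampling rate, and the simple cross-validated evaluation are our own simplifications; in the actual analysis, training was restricted to the neutral set and performance generalized to the three conditions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
sfreq = 200                                                  # assumed rate
epochs = rng.standard_normal((120, 30, int(1.2 * sfreq)))    # trials x comps x time
y = rng.integers(0, 2, 120)                                  # left dependency yes/no

# Pre-offset window: first 400 ms of each (-400 ms .. +800 ms) epoch,
# flattened into one (components x timepoints) feature vector per trial.
X_pre = epochs[:, :, : int(0.4 * sfreq)].reshape(len(epochs), -1)

aucs = []
for tr, te in StratifiedKFold(10, shuffle=True, random_state=0).split(X_pre, y):
    clf = LogisticRegression(C=1.0, max_iter=1000).fit(X_pre[tr], y[tr])
    aucs.append(roc_auc_score(y[te], clf.decision_function(X_pre[te])))
print(np.mean(aucs))
```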

MVPA statistics
We evaluated the performance of the model by assessing the area under the curve (AUC) of the classification against chance (50%) on the test sets. Training and testing were repeated via 10-fold randomized cross-validation. The differences across the three conditions, and against chance level, were assessed in a second-level analysis via paired permutation testing with 2046 permutations (permutation_test from the mlxtend package). A numpy stand-in for this test is sketched below.
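Since the exact mlxtend call is not shown here, the sketch below gives an equivalent numpy stand-in: a sign-flip permutation test on the paired differences of per-subject AUCs:

```python
import numpy as np

def paired_permutation_test(a, b, n_perm=2046, seed=0):
    """Sign-flip permutation test on paired differences (a - b).

    A minimal numpy stand-in for the second-level test described above
    (the published analysis used mlxtend's permutation_test).
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(a) - np.asarray(b)
    obs = d.mean()
    null = np.array([(d * rng.choice([-1, 1], size=d.size)).mean()
                     for _ in range(n_perm)])
    return (np.abs(null) >= np.abs(obs)).mean()   # two-sided p-value

# e.g., comparing per-subject AUCs across two conditions:
# p = paired_permutation_test(auc_coherent, auc_incoherent)
```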

Temporal decoding
A single regularized logistic classifier was trained for each time point and optimized via nested 5-fold cross-validation. The regularization parameter C was selected during nested cross-validation via a log-spaced grid search between 10^-5 and 10^5.
The best temporal model obtained after the grid search was then tested by estimating the AUC of the classification (sklearn.metrics.roc_auc_score) across each of the conditions described above: neutral, incoherent, and coherent. A sketch of this per-timepoint procedure is given below.
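A sketch of the per-timepoint procedure (function and variable names are illustrative; the nested grid search mirrors the description above):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV

# Log-spaced grid over the regularization parameter C, as described above.
grid = {"C": np.logspace(-5, 5, 11)}

def temporal_auc(X_train, y_train, X_test, y_test):
    """Return one AUC per timepoint; X arrays are (trials, comps, times).

    For each timepoint, C is tuned by a nested 5-fold grid search on the
    training data, and the best model is scored on the held-out set.
    """
    aucs = []
    for t in range(X_train.shape[-1]):
        gs = GridSearchCV(LogisticRegression(max_iter=1000), grid,
                          cv=5, scoring="roc_auc")
        gs.fit(X_train[..., t], y_train)
        score = gs.decision_function(X_test[..., t])
        aucs.append(roc_auc_score(y_test, score))
    return np.array(aucs)
```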
The AUC results across these three test sets were used for two different contrasts to assess the effects of prosodic boundaries on the decodability of syntactic left dependencies from brain activity. In the first contrast, the coherent condition was tested against the neutral condition, while in the second we compared the coherent against the incoherent condition.
Critically, while the first contrast allowed us to understand the effect of prosodic content on left dependencies, the second verified that the prosodic boosting is specific to words with left dependencies.

Temporal decoding statistics
Similarly to the MVPA analysis, the temporal decoding results were evaluated using a cluster-based permutation test. The temporal decoding results for the three test sets, as well as for the two contrasts, were extracted for each participant and tested against chance (50%) in a second-level analysis. To do this, we ran a non-parametric cluster-level one-sample t-test with 2046 permutations on the AUC time series obtained from each participant (mne.stats.permutation_cluster_1samp_test), as sketched below.
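A sketch of this second-level test, with synthetic per-subject AUC time series standing in for the real data:

```python
import numpy as np
from mne.stats import permutation_cluster_1samp_test

# subject_aucs: (n_subjects, n_timepoints) AUC time series for one test set
# (synthetic values here; 11 subjects as in the dataset above).
subject_aucs = np.random.default_rng(0).uniform(0.45, 0.65, size=(11, 240))

# Cluster-level one-sample permutation test against chance (AUC = 0.5).
t_obs, clusters, cluster_pv, h0 = permutation_cluster_1samp_test(
    subject_aucs - 0.5, n_permutations=2046)
significant = [c for c, p in zip(clusters, cluster_pv) if p < 0.05]
```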

Funding
This research was funded by the NCCR Evolving Language, Swiss National Science Foundation Agreement #51NF40_180888.